Abstract

Network epidemiology often assumes that the relationships defining 
the social network of a population are static. The dynamics of relation- 
ships is only taken indirectly into account, by assuming that the rele- 
vant information to study epidemic spread is encoded in the network ob- 
tained by considering numbers of partners accumulated over periods of 
time roughly proportional to the infectious period of the disease at hand. 
On the other hand, models explicitly including social dynamics are often 
too schematic to provide a reasonable representation of a real popula- 
tion, or so detailed that no general conclusions can be drawn from them. 
Here we present a model of social dynamics that is general enough that 
its parameters can be obtained by fitting data from surveys about sex- 
ual behaviour, but that can still be studied analytically, using mean field 
techniques. This allows us to obtain some general results about epidemic 
spreading. We show that using accumulated network data to estimate the 
static epidemic threshold leads to a significant underestimation of it. We 
also show that, for a dynamic network, the relative epidemic threshold 
is an increasing function of the infectious period of the disease, implying 
that the static value is a lower bound to the real threshold. 

1 Introduction 

Although the aim of mathematical modelling in epidemiology has always been to help
predict the patterns of spread of infectious diseases, the complexity of real
populations has always forced modellers to make strong assumptions. Even though
these do not always guarantee the existence of analytic solutions, they at least
make the models tractable. On the other hand, the search for analytical simplicity,
or beauty, has sometimes taken precedence over more practical considerations.

One of the strongest assumptions used in most epidemiological models is 
the law of mass action [1]. First proposed by chemists, it postulates that in
dynamical equilibrium the rate of a chemical reaction is proportional to the
concentrations of the reactants, and can be derived from the probability of collision
between reacting molecules. The analogy between the movements of molecules
and living beings, drawn almost a century ago [1], leads to the epidemiological 
version of this postulate: the 'force of infection' is proportional to the densities 
of infected and uninfected individuals (called 'susceptibles' in the epidemiolog- 
ical literature). It implies assuming that the population has no structure, i.e. 
that every person can be in contact with every other ('random mixing'). 

In general, however, members of a population interact only with a very 
small subset of it. Thus, one way to go beyond the random mixing assumption 
is to consider that the members of the population form a social network. Its 
definition depends strongly on the type of interaction necessary to transmit the 
disease whose spread is being modelled. The advantage of this over the random 
mixing approach is that models can be better adapted to specific populations. 
Needless to say, this implies having more data about the social structure, as well 
as new concepts and tools to analyse them. Fortunately, these are provided by 
Social Network Analysis, a field that has developed rapidly in recent years [2]. 
The mathematics are not as straightforward as in the analysis of mass-action 
models, but for some cases some interesting results can be obtained by using 
approximations (some of them derived from statistical physics). One example 
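is the simple relationship that exists, for a disease with infectivity $\lambda$ and an
infectious period $\alpha^{-1}$, between the relative epidemic threshold
$\tilde{\lambda}_c = \lambda_c/\alpha$ and the topological properties of the network [3, 4]:

\[ \tilde{\lambda}_c = \frac{\langle k \rangle}{\langle k^2 \rangle} \tag{1} \]

where $\langle k \rangle$ is the mean of the degree distribution of the social network, and
$\langle k^2 \rangle$ is its second moment.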
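As a small numerical illustration of the static estimate $\langle k \rangle/\langle k^2 \rangle$, the following Python sketch computes the ratio of the first two moments of a degree distribution. The function name and the two example distributions are hypothetical and were chosen only to show how a long tail lowers the estimate; they are not taken from any survey.

```python
import numpy as np

def static_relative_threshold(degree_pmf):
    """Return <k>/<k^2> for a degree distribution given as pmf[k], k = 0, 1, ..."""
    pmf = np.asarray(degree_pmf, dtype=float)
    pmf = pmf / pmf.sum()                  # normalise defensively
    k = np.arange(pmf.size)
    mean_k = np.sum(k * pmf)
    mean_k2 = np.sum(k**2 * pmf)
    return mean_k / mean_k2

# Two hypothetical populations with the same mean degree (1.0).
narrow = np.array([0.0, 1.0])              # everyone has exactly one partner
long_tailed = np.zeros(51)
long_tailed[0] = 0.98                      # most individuals have no partner
long_tailed[50] = 0.02                     # a few individuals have 50 partners

threshold_narrow = static_relative_threshold(narrow)
threshold_tail = static_relative_threshold(long_tailed)
```

Both populations have the same mean degree, but the long-tailed one yields a much smaller ratio, which is the effect discussed in the following paragraphs.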

Network epidemiology seems particularly well suited for the analysis of the 
spread of sexually transmitted diseases, as the definition of the network in this 
case is more straightforward (although not free of problems, see [5]). The large 
number of surveys of sexual behaviour carried out in the last three decades 
has provided an invaluable resource for modellers. Interestingly, one common 
feature of many sexual networks built from survey data is that their degree 
distribution has a very long tail: there exist a small number of individuals who 
have a very large number of sexual contacts. Mathematically, this means that, 
even though $\langle k \rangle$ is rather small (typically less than 3), $\langle k^2 \rangle$ can be very large.
Applying Eq. (1) to such networks (which, as explained below, is not altogether
correct) would lead to the conclusion that, for those populations, even diseases
with very low infectivity can trigger an epidemic. It has even been argued that
some sexual networks have power law degree distributions with infinite variance
[6, 7], which would imply a vanishing epidemic threshold, but there is some
controversy about this [8].

One aspect that is usually disregarded in the network approach is the dy- 
namic nature of social interactions. It is reasonable to assume that this dynamics 
produces a steady-state, in which the distribution of contacts does not change, 
even though at all times individuals are free to end their existing relationships 
and create new ones. Eq. (1) is derived for a static network, and is sometimes
used to estimate the epidemic threshold of populations whose structure is de- 
duced from sexual behaviour surveys. Respondents to these surveys, however, 
are usually asked about number of partners over a certain time period, and the 
distribution thus obtained is often used as a proxy for the steady state, or in- 
stantaneous distribution. But it is difficult to ascertain how close distributions 
of accumulated contacts can be to the instantaneous distribution [9] . It is often 
suggested that if the time period asked about in the survey is similar to the 
infectivity period of the disease analysed, epidemic thresholds can be calculated 
by using the proxy network (see for example [TUl IHl HI]). But in general this
argument remains at a qualitative level. In principle, it should be possible to 
see whether the dynamics affects the spread of the disease only by generating a 
steady state distribution or there are other effects independent of this. 

Models that take into account the dynamic nature of social networks usually
consider that the formation and dissolution of links between individuals
are stochastic processes [T^]. More recently, such models have also been used
to understand the spread of infectious diseases [T31 [HI [ISl [Ml [13 [IB]. But, in
general, the additional complication of dealing with network dynamics has led 
either to models that have analytical solutions but that are too simple to be 
applied in a realistic setting, or to models that rely exclusively on numerical 
simulations, from which it is difficult to draw general conclusions. The model 
of network dynamics presented in the next section is an attempt at overcoming 
these limitations. It can be tailored to give similar accumulated degree distri- 
butions to those obtained in real surveys, as shown in the third section, but it 
also allows us to obtain some very general analytical results for the influence 
of network dynamics on the propagation of infectious diseases, using mean field 
techniques. 

2 Model 

We consider a population of $N$ epidemiologically identical individuals. For this
case it has been shown that static models with individuals placed on a bipartite
network give identical predictions to models where the population is not divided
into two groups [9], so we assume that partnerships can be established between any
two individuals. Thus, even though our model applies strictly only to homosexual
populations, its predictions should be qualitatively correct for heterosexual
populations with similar epidemiological variables for both sexes.

Partnerships can be established and dissolved at rates that depend on features of
the two individuals. As the only dynamic attribute we consider is the number of
partners, we first assume that the rates depend only on it. Thus, the rate of
partnership creation between individuals $i$ and $j$ is $\rho(k_i, k_j, t)$ and the
rate of partnership dissolution is $\sigma(k_i, k_j, t)$, where $k_i$ and $k_j$ are the
numbers of current sexual partners of $i$ and $j$ at time $t$. As we only deal with
steady states, hereafter the $t$ dependence is dropped from all quantities.
In equilibrium, the master equation for the steady state degree distribution
$P(k)$ becomes:

\[
\begin{aligned}
0 ={}& (N-k)\,P(k-1)\,\rho_{k-1} + (k+1)\,P(k+1)\,\sigma_{k+1} \\
& - P(k)\,(N-1-k)\,\rho_k - k\,P(k)\,\sigma_k
\end{aligned}
\tag{2}
\]

where $\rho_k = \langle \rho(k, k_i) \rangle_i$ is the average probability that an individual with $k$
partners gets a new partner and $\sigma_k = \langle \sigma(k, k_i) \rangle_i$ is the average probability that
an individual breaks one of his or her existing relationships. In principle, the link
creation probability should be averaged only over those individuals that are not
current partners of the individual. However, as in real populations $k$ is much
smaller than $N$, this quantity is very well approximated by the average over the
entire population:

\[ \langle \rho(k, k_i) \rangle_i \approx \frac{1}{N} \sum_{i=1}^{N} \rho(k, k_i) = \sum_{k'} \rho(k, k')\,P(k') \tag{3} \]

For the link dissolution probability, the distribution that should be used
to calculate the average is $P(k_i|k)$, the degree distribution of the individuals
that are connected to an individual having $k$ partners. However, if we assume
that the dynamics does not generate significant assortative mixing by degree,
$P(k_i|k)$ can be written as $P(k_i|k) = k_i P(k_i)/\langle k \rangle$. This is not too stringent
an assumption, since there seems to be no definite tendency in mixing with respect
to sexual activity: some sexual networks have been found to be weakly assortative
[19], some neutral [20] and some disassortative [21]. The resulting average
link dissolution is, then,

\[ \langle \sigma(k, k_i) \rangle_i = \sum_{k'} \sigma(k, k')\,k'\,P(k')/\langle k \rangle. \tag{4} \]

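Solving Eq. (2) gives the steady state degree distribution:

\[ P(k) = P(0) \binom{N-1}{k} \prod_{i=0}^{k-1} \frac{\rho_i}{\sigma_{i+1}} \tag{5} \]

for $k > 0$. $P(0)$ is obtained by normalizing the distribution. $P(k)$ can also be
written as

\[ P(k; x_0, \cdots, x_{N-1}) = P(0) \binom{N-1}{k} \prod_{i=0}^{k-1} x_i \tag{6} \]

where the $N$ parameters $x_i$ ($i = 0, \cdots, N-1$) are obtained by solving the $N$
self-consistency equations

\[ x_i = \frac{\langle k \rangle \sum_l \rho(i, l)\,P(l; x_0, \cdots, x_{N-1})}{\sum_l \sigma(i+1, l)\,l\,P(l; x_0, \cdots, x_{N-1})} \tag{7} \]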
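The mean field construction described above can be sketched numerically. The Python fragment below is a minimal illustration, not the procedure used to obtain the results of this paper. It assumes that the pair rates are supplied as hypothetical tables `rho_pair[k, k']` and `sigma_pair[k, k']`, averages them over the population (creation rates weighted by $P(k')$, dissolution rates by $k'P(k')/\langle k \rangle$), and builds the steady state distribution by balancing link creation, $(N-k)\rho_{k-1}P(k-1)$, against link dissolution, $k\sigma_k P(k)$. A full solution would repeat the two steps until the distribution stops changing.

```python
import numpy as np

def mean_field_rates(rho_pair, sigma_pair, P):
    """Average hypothetical pair-rate tables over the degree distribution P."""
    k = np.arange(P.size)
    mean_k = np.sum(k * P)
    rho_k = rho_pair @ P                        # sum_k' rho(k, k') P(k')
    sigma_k = sigma_pair @ (k * P) / mean_k     # sum_k' sigma(k, k') k' P(k') / <k>
    return rho_k, sigma_k

def steady_state(rho_k, sigma_k, N):
    """Degree distribution for k = 0, ..., N-1 from the creation/dissolution balance."""
    P = np.zeros(N)
    P[0] = 1.0
    for k in range(1, N):
        P[k] = P[k - 1] * (N - k) * rho_k[k - 1] / (k * sigma_k[k])
    return P / P.sum()

# Illustrative usage with degree-independent rates (assumed values).
N = 20
P_guess = np.full(N, 1.0 / N)
rho_k, sigma_k = mean_field_rates(np.full((N, N), 0.1), np.ones((N, N)), P_guess)
P = steady_state(rho_k, sigma_k, N)
```

With degree-independent rates the recursion reduces to a binomial distribution, which gives a simple check of the implementation.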
If a model is to be used for understanding the spread of a disease in a real
population, its parameters should be adjusted by comparing with the available 
population data. For simpler models, it has been suggested that this could 
be done by using an empirical instantaneous distribution [16]. In our model,
however, Eqs. (6) and (7) show that rescaling the link creation and dissolution
functions does not change the equilibrium distribution. This was to be expected,
because changing the time scale cannot change the nature of the steady
state reached. Thus, time scales should be obtained from other population 
measurements. An important problem of this approach is that, unfortunately, 
information about instantaneous degree distributions is usually not available. 
Instead, almost all surveys ask respondents about the number of sexual contacts
accumulated over a certain time period. Thus, what we need to know from the model
is the distribution of accumulated contacts (i.e. the probability of having had $k$
contacts during a given time period), $P_T(k)$, which can be written as

\[ P_T(k) = \sum_{k'=0}^{k} P_T(k - k' | k')\,P(k') \tag{8} \]

where $P_T(k - k' | k')$ is the probability of having $k - k'$ new contacts over a time
period of length $T$, conditional on having $k'$ partners at the beginning of that
period. The equations that these conditional probabilities satisfy are

\[
\begin{aligned}
\dot{P}_T(m|n) ={}& (N-1-n)\,\rho_n \left[ P_T(m-1|n+1) - P_T(m|n) \right] \\
& + n\,\sigma_n \left[ P_T(m|n-1) - P_T(m|n) \right]
\end{aligned}
\tag{9}
\]

for $0 \le m, n \le N-1$, with $\rho_{N-1} = 0$ and $\sigma_0 = 0$, where the dot denotes the
derivative with respect to $T$. With the aid of some mathematical software, such as
Mathematica or Matlab, this recursion can be solved exactly, for any desired value
of $T$ (see Appendix). Using this, the parameters $\rho$ and $\sigma$ can be adjusted to fit
the distributions obtained in any given survey. An example of this is given in the
next section.
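As an alternative to an exact symbolic solution, the equations for the conditional probabilities can also be integrated numerically. The sketch below is only an illustration under stated assumptions: `gain[n]` is taken to be the total rate at which an individual with $n$ partners acquires a new one (assumed here to be $(N-1-n)\rho_n$, vanishing for the largest $n$), `loss[n]` is the total rate $n\sigma_n$ at which a partner is lost, and a fixed-step fourth-order Runge-Kutta scheme replaces the exact solution mentioned in the text. All function and variable names are hypothetical.

```python
import numpy as np

def new_contact_probabilities(gain, loss, T, m_max, steps=400):
    """Integrate the equations for Q[m, n] = P_T(m | n) up to time T."""
    gain = np.asarray(gain, dtype=float)
    loss = np.asarray(loss, dtype=float)
    Q = np.zeros((m_max + 1, gain.size))
    Q[0, :] = 1.0                          # at T = 0 nobody has new contacts

    def rhs(Q):
        after_gain = np.zeros_like(Q)      # Q(m-1 | n+1)
        after_gain[1:, :-1] = Q[:-1, 1:]
        after_loss = np.zeros_like(Q)      # Q(m | n-1)
        after_loss[:, 1:] = Q[:, :-1]
        return gain * (after_gain - Q) + loss * (after_loss - Q)

    h = T / steps
    for _ in range(steps):                 # classical fourth-order Runge-Kutta
        k1 = rhs(Q)
        k2 = rhs(Q + 0.5 * h * k1)
        k3 = rhs(Q + 0.5 * h * k2)
        k4 = rhs(Q + h * k3)
        Q = Q + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return Q

def accumulated_distribution(Q, P):
    """Weight Q by the instantaneous distribution P to obtain P_T(k), k <= m_max."""
    m_max = Q.shape[0] - 1
    P_T = np.zeros(m_max + 1)
    for k in range(m_max + 1):
        for k0 in range(min(k, P.size - 1) + 1):
            P_T[k] += Q[k - k0, k0] * P[k0]
    return P_T
```

Because $Q(m|\cdot)$ depends only on rows with smaller or equal $m$, truncating at `m_max` does not affect the rows that are kept. With a constant acquisition rate and no dissolution the number of new contacts should be Poisson distributed, which provides a convenient sanity check.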

3 Application examples 

The number of self consistency equations to be solved (Eqs. (7)) imposes a practical
constraint on the models that can be effectively analyzed. One of the simplest
ways to reduce the number of equations to only one is to consider functions of
the form $\rho(k_i, k_j) = \rho(k_i)\rho(k_j)$ and $\sigma(k_i, k_j) = \sigma(k_i)\sigma(k_j)$. This choice has the
added advantage of ensuring that there is no assortative mixing by degree. Note
that if $\rho(k)$ is an increasing function of $k$, individuals with many partners are
more likely to attract new ones. This is usually known as preferential attachment
in the network literature [22]. Interestingly, it has been shown that this is
likely to play a role in the formation of sexual networks [23].

First we analyze two different models, called A and B, that generate almost
the same instantaneous degree distribution. Model A is defined by the functions
$\rho(k) = C_A k^3/(k+1)^2$ (for $k > 0$), $\rho(0) = 1$, and $\sigma(k) = 1$, whereas model B is
defined by $\rho(k) = C_B k^3/(k+1)$ (for $k > 0$), $\rho(0) = 1$, and $\sigma(k) = k$. $C_A$ and $C_B$
are numerical constants. For $k > 0$ the instantaneous distribution is
$P(k) \propto \Omega_k\,x^k/k^3$, where $\Omega_k = \prod_{i=1}^{k}(1 - i/N)$ and $x$ is obtained by solving
the self consistency equation for each model. The constants $C_A$ and $C_B$ are adjusted
to obtain a degree distribution that has a mean value of order 1, and a variance
large enough to mimic the long tails observed in sexual networks. We find that there
is a critical value for $C_A$ and $C_B$ below which the network is sparsely connected,
and above which the network becomes dense, in the sense that each individual
is connected to a significant fraction of the population (see Appendix). This
is usually called a phase transition. Thus, to obtain a relatively wide degree
distribution while keeping the network sparse, $C_A$ and $C_B$ were given values that
are close to (but below) the critical value.
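To see how the width of such a distribution grows as the critical point is approached, one can evaluate the moments of a truncated power law numerically. The sketch below assumes tail weights of the form $\Omega_k x^k/k^{\gamma}$ for $k \geq 1$, with $\Omega_k = \prod_{i=1}^{k}(1 - i/N)$; the exponent `gamma` and the values of `x` are parameters of the illustration, not fitted values, and the function names are hypothetical.

```python
import numpy as np

def tail_weights(x, N, gamma=3.0):
    """Unnormalised weights Omega_k * x**k / k**gamma for k = 1, ..., N-1."""
    k = np.arange(1, N)
    log_omega = np.cumsum(np.log1p(-k / N))   # log of prod_{i=1..k} (1 - i/N)
    return np.exp(log_omega + k * np.log(x) - gamma * np.log(k))

def static_threshold(x, N, gamma=3.0):
    """<k>/<k^2>; P(0) and any overall prefactor cancel in this ratio."""
    k = np.arange(1, N)
    w = tail_weights(x, N, gamma)
    return np.sum(k * w) / np.sum(k**2 * w)

# The ratio shrinks as x approaches 1, i.e. as the tail becomes heavier.
thresholds = [static_threshold(x, N=10000) for x in (0.5, 0.9, 0.99)]
```

Working with logarithms keeps the product over $(1 - i/N)$ numerically stable for large populations.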

Fig. 1 shows that the mean field approach is a very good approximation for
the corresponding stochastic model, both for the instantaneous degree distribution
and for the accumulated ones. It also shows that, even for models with the same
instantaneous degree distribution, the distribution of the number of accumulated
partners can be rather different. As a consequence, the usual approach of fitting
the tail of these distributions with a power law function would not give the same
exponent for models A and B. The accumulated distributions can be used to calculate
epidemic thresholds, using Eq. (1), which can be considered as approximations to
$\tilde{\lambda}_c^0$, the static threshold. The inset shows that these approximations can be
very different from the actual value of $\tilde{\lambda}_c^0$.

To see whether these differences are relevant in a real setting, we have applied 
this model to data from the National Survey of Sexual Attitudes and Lifestyles II 
(NATSAL 2000), carried out in Britain in 2000-2001 [24, 25]. Participants were
asked about the number of male and female partners during several, overlapping, 
time periods previous to the survey: 1 month, 3 months, 1 year, and 5 years. 
From these data, one can build, for each time period, the distribution of the 
number of accumulated partners. 

Furthermore, we have only used the data related to homosexual men, since
our model deals strictly with one-sex populations. However, as NATSAL participants
were not asked about their sexual orientation, we have defined MSM (men who have
sex with men) as those men who reported at least one male partner within the five
years prior to interview. This leaves 166 out of 4762 male respondents. Because of
recall problems, the accuracy of the reports decreases as the time period asked
about increases [27]. This is already apparent in the data for 5 years (not shown),
where there is substantial heaping. In our case, this data set is further skewed
because it has been used to define MSM. Thus, we have adjusted our model to fit only
the degree distributions for 1 month, 3 months and 1 year (see Appendix). We have
not used the data about lifetime number of partners, because the time periods
involved were not the same for all participants (whose ages ranged from 16 to
44 years), as assumed in our model.

Figure 1: Distributions of number of partners accumulated over a time period
$T$, for models A and B (see text). The full lines for $T = 0$ are given by Eq. (5),
whereas the other lines are obtained by solving recursively Eqs. (9). Symbols
correspond to simulations for a system with 10000 individuals (averaged over
100 runs). The symbols and lines falling on the left vertical axis represent the
fraction of individuals having 0 sexual partners. Error bars are smaller than
the symbols. The inset shows the static epidemic threshold calculated for the
distribution of accumulated partners for different time periods, for both models.

Fig. 2 shows the distribution of accumulated partners for the four time intervals
analyzed. The fit is reasonably good for the three curves used. Even though
the data for the 5 years period are overestimated, the tendency seems to be
correct. The inset shows the approximations to the static threshold, calculated
using the model degree distributions for several time periods (see Appendix).
As in the previous figure, the approximations get worse when calculated using
longer time periods. In fact, already the 1 month distribution leads to an
underestimation of $\tilde{\lambda}_c^0$ of about 50%.

To understand whether this underestimation is relevant, the spread of a 
disease should be analyzed taking into account the intrinsic dynamics of the 
network. The question is not only how close the real and static thresholds are, 
but even which one is larger, because it could happen that the real threshold 
was smaller than the static one, thus compensating for the underestimation of 
the approximations calculated with accumulated degree distributions. In the 
next section it is shown that this is not the case: real thresholds are always 
larger than static ones. 
Figure 2: Cumulative distribution of the number of sexual partners accumulated
over different periods of time for a population of homosexual men. Symbols
correspond to data from the British National Survey of Sexual Attitudes and
Lifestyles (NATSAL 2000). The lines joining the symbols are only guides to the 
eye. The full lines correspond to the fits of the epidemic model for each time 
period. The lowest dotted line is the prediction for the instantaneous cumulative 
degree distribution. The inset shows the values given by the model for the static 
epidemic threshold, calculated from the degree distribution for the time periods 
analyzed. 
4 Epidemic spread



We consider the propagation of a disease that can be cured, and that confers no
immunity, i.e. individuals can be reinfected as soon as they become susceptible
again. Models of this type, called SIS, are considered acceptable models of
sexually transmitted diseases such as gonorrhea and chlamydia [28].

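It is assumed that, in an existing relationship between a susceptible and
an infected individual, infection can pass with a probability $\lambda$ per unit time,
and that infected individuals heal at a rate $\alpha$. We also assume that the social
dynamics is not affected by the propagation of the disease. We need to
calculate $P_x(k, I; t)$, the probability that at time $t$ an agent $x$ has $k$ simultaneous
relationships and is infected. The master equation for this depends on the
two-point probabilities $P_{xy}(k_x, S; k_y, I; t)$, which in turn depend on three-point
probabilities, and so on. To get a closed system we choose the simplest ansatz:
$P_{xy}(k_x, S; k_y, I; t) \approx P_x(k_x, S; t)\,P_y(k_y, I; t)$. Using this, and averaging over all
agents with the same number of partners, $k$, the master equation for $P(k, I)$
becomes

\[ (\alpha \mathbf{I} + \mathbf{A}_\theta)\,\mathbf{P}_I = \lambda \theta\,\mathbf{kP} \tag{10} \]

where $\mathbf{A}_\theta$ is a tridiagonal matrix defined by $(\mathbf{A}_\theta)_{i,i+1} = -i\sigma_i$,
$(\mathbf{A}_\theta)_{ii} = (N-i)\rho_{i-1} + (i-1)\sigma_{i-1} + (i-1)\lambda\theta$ and
$(\mathbf{A}_\theta)_{i+1,i} = -(N-i)\rho_{i-1}$, and the vectors $\mathbf{P}_I$ and $\mathbf{kP}$ are given by
$(\mathbf{P}_I)_j = P(j, I)$ and $(\mathbf{kP})_j = j\,P(j)$. (Matrix rows and columns are numbered
from 1, so that index $i$ corresponds to $k = i - 1$ partners, whereas vector components
are labelled by the number of partners $j = 0, \cdots, N-1$.) $P(j)$ is given by
Eq. (5). $\theta$ is the probability of having an infected partner [4],
$\theta = \mathbf{k} \cdot \mathbf{P}_I/\langle k \rangle$, where $\mathbf{k}$ is the vector of degrees, $(\mathbf{k})_j = j$, and is
obtained from the self consistency condition,

\[ \mathbf{k} \cdot (\alpha \mathbf{I} + \mathbf{A}_\theta)^{-1}\,\mathbf{kP} = \frac{\langle k \rangle}{\lambda} \tag{11} \]

The epidemic threshold can now be easily obtained by taking the limit $\theta \to 0$:

\[ \lambda_c = \frac{\langle k \rangle}{\mathbf{k} \cdot (\alpha \mathbf{I} + \mathbf{A}_0)^{-1}\,\mathbf{kP}} \tag{12} \]

The fraction of infected individuals is

\[ n_I = \theta \lambda\,\mathbf{1} \cdot (\alpha \mathbf{I} + \mathbf{A}_\theta)^{-1}\,\mathbf{kP} \tag{13} \]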

where $\mathbf{1}$ is the vector with all components set to 1. In the limit where the
characteristic times of the disease are much shorter than the ones characterizing
the social dynamics (i.e. $\lambda \to \infty$, $\alpha \to \infty$, but keeping $\tilde{\lambda} = \lambda/\alpha$ constant), the
usual result for a static network is obtained (Eq. 1): $\tilde{\lambda}_c^0 = \langle k \rangle/\langle k^2 \rangle$. Intuitively,
one can think that the disease spreads so fast that it 'sees' only the instantaneous
network. The opposite limit can also be calculated (see SI text), giving
$\tilde{\lambda}_c^\infty = \langle k \rangle^{-1}$. Thus in this case, the social dynamics is so fast that, in terms of
disease spread, the network is equivalent to an 'average' network where all nodes
have the same degree, $\langle k \rangle$. Note that $\tilde{\lambda}_c^\infty > \tilde{\lambda}_c^0$. It is interesting to note that,
in the limit cases, the social dynamics influences disease spread only through the
instantaneous network of contacts.

Figure 3: Relative epidemic threshold as a function of the infectious period of
the disease, $t_I$ (in months). The inset shows the epidemic threshold as a function
of infectious period. Both curves were calculated using the model obtained by
fitting the NATSAL data (NATSAL model).
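The two limits of the threshold discussed above can be checked numerically for a small system. The following sketch is an illustration under stated assumptions rather than the code used for the figures: degrees are indexed directly by $k = 0, \cdots, N-1$, an individual with $k$ partners is assumed to gain a partner at rate $(N-1-k)\rho_k$ and lose one at rate $k\sigma_k$, and the rates in the usage lines are arbitrary illustrative values. The function names are hypothetical.

```python
import numpy as np

def steady_state(rho, sigma, N):
    """Instantaneous degree distribution from the creation/dissolution balance."""
    P = np.zeros(N)
    P[0] = 1.0
    for k in range(1, N):
        P[k] = P[k - 1] * (N - k) * rho[k - 1] / (k * sigma[k])
    return P / P.sum()

def relative_threshold(rho, sigma, N, alpha):
    """lambda_c / alpha, with the matrix A_0 built for degrees k = 0, ..., N-1."""
    k = np.arange(N, dtype=float)
    P = steady_state(rho, sigma, N)
    gain = (N - 1 - k) * rho               # rate of acquiring a partner
    loss = k * sigma                       # rate of losing a partner
    A0 = np.diag(gain + loss)
    A0[np.arange(N - 1), np.arange(1, N)] = -loss[1:]    # entries (k, k+1)
    A0[np.arange(1, N), np.arange(N - 1)] = -gain[:-1]   # entries (k+1, k)
    v = np.linalg.solve(alpha * np.eye(N) + A0, k * P)
    lambda_c = np.sum(k * P) / np.dot(k, v)
    return lambda_c / alpha

# Illustrative usage: degree-independent rates (assumed values).
N = 30
rho = np.full(N, 0.05)
sigma = np.ones(N)
short_period = relative_threshold(rho, sigma, N, alpha=1e6)    # t_I -> 0
medium_period = relative_threshold(rho, sigma, N, alpha=1.0)
long_period = relative_threshold(rho, sigma, N, alpha=1e-6)    # t_I -> infinity
```

For very large `alpha` the result should approach $\langle k \rangle/\langle k^2 \rangle$, for very small `alpha` it should approach $1/\langle k \rangle$, and intermediate values should lie in between.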

Fig. 3 shows that the relative epidemic threshold of the NATSAL model
is larger for diseases with longer infectious periods, $t_I = \alpha^{-1}$. Note that for
infectious periods of the order of a few months, as is the case for untreated
gonorrhea, chlamydia and syphilis, the difference between the corresponding
threshold and the static approximation, $\tilde{\lambda}_c^0$, can be significant. In terms of the
non-normalized epidemic threshold, the inset of Fig. 3 shows that when the
dynamics of the network is taken into account, $\lambda_c$ decreases more slowly with
$t_I$.

Interestingly, it can be proved (see Appendix) that the effect of the dynamics
is the same for all possible choices of the link creation and dissolution
functions, $\rho(k_i, k_j)$ and $\sigma(k_i, k_j)$: the relative epidemic threshold always grows
monotonically with $t_I$. Even though the mean field approximation is not very
good for sparse networks (as should be the case for most instantaneous sexual
networks), it can be conjectured that the picture is not qualitatively different.
This is supported by simulations carried out for the stochastic analog of the
NATSAL model. Fig. 4 shows that the qualitative behavior of the simulation
curves is well predicted by the mean field approximation. Note that the real
epidemic threshold is even larger than the mean field value and therefore the
underestimation mentioned before is even worse when compared with simulation
values.

For large values of the infectivity, Fig. 4 shows that $n_I$, the fraction of
infected individuals in the endemic state, grows with $t_I$. This too is a general
feature of this kind of model. Interestingly, for very large $\lambda$, $n_I$ does not tend
to 1:

\[ \lim_{\lambda \to \infty} n_I = 1 - \frac{P(0)}{1 + (N-1)\rho_0 t_I}. \tag{14} \]

In a static network (i.e. $t_I \to 0$), the disease cannot reach isolated individuals.
In the dynamic case, however, even momentarily isolated people get a partner
after a time $1/[(N-1)\rho_0]$, on average. But there is a probability that isolated,
infected people get cured before they get a partner. This ensures that there is
always a fraction of the isolated individuals that is not infected, no matter how
high the infectiousness of the disease is. The proportion of partners that are
infected, $\theta$, is also an increasing function of $\lambda$ but it tends to 1 for large
infectivities, for all values of $t_I$. It can also be proved that, for fixed values
of $\tilde{\lambda}$, $\theta$ is a decreasing function of $t_I$.
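The large-infectivity limit of $n_I$ can be sketched from the balance of isolated, infected individuals. The derivation below is only a sketch under two assumptions that are stated here explicitly: an isolated individual acquires a partner at total rate $(N-1)\rho_0$, and for $\lambda \to \infty$ every individual with at least one partner is infected, so that $P(1, I) \to P(1)$.

```latex
% Balance for isolated, infected individuals (k = 0): they cannot be infected directly,
% they leave this state by healing or by acquiring a partner, and they are replenished
% by infected individuals with one partner who lose it.
\left[ \alpha + (N-1)\rho_0 \right] P(0, I) = \sigma_1 P(1, I)

% For \lambda \to \infty, P(1, I) \to P(1); the steady state of the social dynamics
% gives \sigma_1 P(1) = (N-1)\rho_0 P(0), hence
P(0, I) \to \frac{(N-1)\rho_0}{\alpha + (N-1)\rho_0}\, P(0)

% Only isolated susceptibles remain uninfected, so with t_I = \alpha^{-1}
\lim_{\lambda \to \infty} n_I = 1 - \left[ P(0) - P(0, I) \right]
  = 1 - \frac{P(0)}{1 + (N-1)\rho_0 t_I}
```

In this form the static limit $t_I \to 0$ leaves all isolated individuals uninfected, while for very long infectious periods the uninfected fraction vanishes.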



5 Including intrinsic features and neighbourhoods 

The model analyzed in the previous sections can be extended in many ways, in
order to make it more realistic. One of them is to consider that the attraction
between individuals can depend not only on the number of partners, which is a
dynamical variable, but also on intrinsic features of each individual, called fitness
in the network literature, that do not change over time (or at least over the times
relevant for the problem). Many characteristics have been proposed to account
for attraction, such as beauty, talent, socioeconomic status, and even geographical
location. The downside to this added realism is that such features are not easy
to define unambiguously [32], let alone quantify. It is interesting, however, to see
that some general properties can be derived for our model.

We assume that the fitness $f$ takes a finite number of values, whose probability
mass function is $\pi(f)$. The rates of partnership creation and dissolution
now depend on the fitness of each agent: $\rho(k_i, f_i, k_j, f_j)$ and $\sigma(k_i, f_i, k_j, f_j)$. The
population can be divided into subpopulations with a common value of $f$, with a
degree distribution $P(k|f)$ given by Eq. (5). One important difference with the
model analyzed in the previous sections is that the time average of the number
of partners is not the same for all individuals, but depends on their fitness.
The interaction between the subpopulations is encoded in the self consistency
parameters $x_i(f)$, calculated from

\[ x_i(f_1) = \frac{\langle k \rangle \sum_{l, f_2} \rho(i, f_1; l, f_2)\,P(l; x_0, \cdots, x_{N-1} | f_2)\,\pi(f_2)}{\sum_{l, f_2} \sigma(i+1, f_1; l, f_2)\,l\,P(l; x_0, \cdots, x_{N-1} | f_2)\,\pi(f_2)} \tag{15} \]

It is also possible to obtain the distribution of accumulated contacts. In
this case $P_T(m|n, f)$ is the probability that an individual with fitness $f$, having
$n$ partners at the beginning of a given time period of duration $T$, has had $m$
partners at the end of that period. There is now a set of equations for each $f$,
analogous to Eqs. (9), that can be solved independently of each other. The degree
distribution for the period $T$ is $P_T(m) = \sum_f \sum_{n=0}^{m} P_T(m|n, f)\,P(n|f)\,\pi(f)$.

Figure 4: Fraction of the population that is infected, in the equilibrium state,
as a function of the infectivity of the disease, for several values of the infectious
period. The upper panel shows the theoretical curves obtained for the NATSAL
model. The lower panel shows the results of simulations of populations of 10000
individuals, using the same parameters as for the NATSAL model. Symbols
correspond to averages over 100 runs. The lines joining the symbols are only
guides to the eye.

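The analysis of the spread of an infectious disease can be carried out in much
the same way as in the previous section. The mean field approach leads to
an equation analogous to Eq. (10) for each subpopulation. The probability that
a partner of an individual is infected, $\theta$, is again assumed to be independent of
the individual, and is obtained by solving:

\[ \left\langle \mathbf{k} \cdot (\alpha \mathbf{I} + \mathbf{A}_\theta)^{-1}\,\mathbf{kP}_f \right\rangle_f = \frac{\langle k \rangle}{\lambda} \tag{16} \]

where $(\mathbf{kP}_f)_j = j\,P(j|f)$, $\langle \cdot \rangle_f$ denotes an average over the distribution $\pi(f)$ and
$\langle \cdot \rangle$ denotes an average over both $\pi(f)$ and $P(n)$. The epidemic threshold in this
approximation is

\[ \lambda_c = \frac{\langle k \rangle}{\left\langle \mathbf{k} \cdot (\alpha \mathbf{I} + \mathbf{A}_0)^{-1}\,\mathbf{kP}_f \right\rangle_f} \tag{17} \]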

It is instructive to compare the cases where different fitness distributions
generate the same instantaneous network. As expected, the static limit ($t_I \to 0$)
does not depend on $\pi(f)$. But the opposite limit does depend on the fitness:

\[ \tilde{\lambda}_c^\infty(\pi(f)) = \frac{\langle k \rangle}{\left\langle \bar{k}(f)^2 \right\rangle_f}, \tag{18} \]

where $\bar{k}(f)$ is the average of $k$ over the individuals with the same value of $f$.
If there is a nontrivial fitness distribution, it can be shown that this value is
strictly smaller than $\tilde{\lambda}_c^\infty = 1/\langle k \rangle$, the limit found in the previous section. In
other words, the effect of the social dynamics on the spread of the disease is less
pronounced if the instantaneous network is (at least partly) generated by the
features of the individuals.
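A two-class example makes the comparison concrete. The sketch below assumes that the fast-dynamics limit with fitness takes the form $\langle k \rangle/\langle \bar{k}(f)^2 \rangle_f$; the class weights and mean degrees are hypothetical numbers chosen so that both populations have the same overall mean degree.

```python
import numpy as np

def fast_dynamics_threshold(pi_f, kbar_f):
    """<k> / <kbar(f)^2>_f for fitness classes with weights pi_f and mean degrees kbar_f."""
    pi_f = np.asarray(pi_f, dtype=float)
    kbar_f = np.asarray(kbar_f, dtype=float)
    mean_k = np.sum(pi_f * kbar_f)
    return mean_k / np.sum(pi_f * kbar_f**2)

# Hypothetical populations, both with <k> = 1.5.
single_class = fast_dynamics_threshold([1.0], [1.5])            # no fitness differences
two_classes = fast_dynamics_threshold([0.9, 0.1], [1.0, 6.0])   # small high-activity class
```

Without fitness differences the expression reduces to $1/\langle k \rangle$; concentrating activity in a small class lowers the threshold, in line with the inequality stated above.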

In STD epidemiology it is often assumed that there is a small group of
individuals, usually called the core group, whose contribution to the spread of the
disease is disproportionately large. Even though there is some ambiguity in its
exact characterization [33], this label is frequently applied to people with
very many sexual contacts [16]. Our result suggests that, even with the same
number of individuals at any time, dynamic core groups (whose composition
changes with time) might not be as effective as static ones in driving an epidemic.

One potential drawback of including intrinsic features is that the computa- 
tional work needed to obtain the different predictions of the model is multiplied 
by the number of possible values of the fitness. It must be noted, however, that 
in sociological studies many features are quantified with a very small number 
of values. For example, income is usually quantified in quintiles or deciles, and 
physical attractiveness, because of its intrinsic ambiguity, has been quantified 
in many sociological studies in scales having between 5 and 10 values. 
Another aspect of the model that can be criticized is that, at any given
time, any two individuals in the population can become sexual partners. This
is unrealistic not only geographically but also (and even more so) socially. One
way to overcome this limitation is to assume that each individual can only
become a sexual partner of a fixed set of individuals, which form his or her
'social neighbourhood'. Numerical simulations show that, for populations with
neighbourhoods consisting of a few hundred individuals, results are almost
indistinguishable from the ones presented in the previous sections.

6 Discussion and conclusion 

Most models that take social dynamics into account seem to belong to two 
groups. One group consists of models that are analytically solvable but are too 
schematic to account for many important features of real populations. The other 
group consists of models that are much more complex, with many parameters 
that can be obtained from population data, but whose very complexity implies 
that their study can only be carried out by means of computational simulations. 
The model presented here is an attempt at bridging the gap between these two 
groups. On the one hand, it is sufficiently general to allow its parameters to be 
obtained by fitting data from population surveys. The example analyzed shows 
that the fits obtained can be very reasonable. On the other hand, the model can 
be studied analytically using mean field techniques, which allows us to obtain 
some general results. 

We have found that, because of the interplay between the social and the
epidemic dynamics, the relative epidemic threshold, as a function of the
average duration of infection, increases monotonically between the two limit
cases, $\lambda^0 = \langle k\rangle/\langle k^2\rangle$ and
$\lambda^\infty = 1/\langle k\rangle$. Thus, approximating the epidemic
threshold by the static network threshold entails an underestimation. The
example analyzed shows that, in real cases, this underestimation can be
significant for diseases having an infectious period of the order of months.
But, even in the case when $\lambda^0$ is a good approximation, the problem
that remains is how to
estimate its value from survey data. Participants in surveys about sexual be- 
haviour are usually asked about number of partners during one or several time 
periods. Any properties of the instantaneous contact network must therefore 
be inferred from that information. Usually, $\lambda^0$ is estimated from the network
built by considering the distribution of the number of accumulated partners as 
a degree distribution, for each time period. We have shown that, as is usually 
assumed, this approximation improves as shorter time periods are considered. 
Unfortunately, we have also shown that, in real cases, even the values obtained 
for rather short time periods (1 month) can be much smaller than $\lambda^0$.
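
As a rough illustration of the two limits discussed above, the following
sketch computes the static-network value $\langle k\rangle/\langle k^2\rangle$
and the long-infectious-period value $1/\langle k\rangle$ (see Appendix D) from
a degree distribution. The function names and the example distribution are
hypothetical and chosen only for illustration; they are not taken from the
NATSAL fits.

```python
def moments(dist):
    """Return (<k>, <k^2>) for a degree distribution given as {k: P(k)}."""
    total = sum(dist.values())
    mean_k = sum(k * p for k, p in dist.items()) / total
    mean_k2 = sum(k * k * p for k, p in dist.items()) / total
    return mean_k, mean_k2


def threshold_limits(dist):
    """Return the two limiting values of the relative epidemic threshold."""
    mean_k, mean_k2 = moments(dist)
    lam_static = mean_k / mean_k2  # static network limit, <k>/<k^2>
    lam_long = 1.0 / mean_k        # long infectious period limit, 1/<k>
    return lam_static, lam_long


# Made-up instantaneous degree distribution (assumption, illustration only).
example = {0: 0.30, 1: 0.55, 2: 0.10, 3: 0.04, 10: 0.01}
lam0, lam_inf = threshold_limits(example)
print(lam0, lam_inf)
```

Since $\langle k^2\rangle \geq \langle k\rangle^2$ for any distribution, the
first value can never exceed the second, which is consistent with the static
threshold acting as a lower bound.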

It is often assumed that to study the spread of diseases with short infectious 
periods the relevant information is encoded in the distribution of sexual part- 
ners for small time periods, whereas longer time periods (of the order of years) 
are more relevant for diseases with long infectious periods. The results of the
previous sections show that this might not be the case, at least for the epidemic 
threshold. It is true that sometimes this threshold is well approximated by the
static limit, whose estimation necessitates information about sexual partners in 
time periods as short as possible. But for diseases with long infectious periods, 
we find that the epidemic threshold obtained with distribution of partners for 
long time periods underestimates the static epidemic threshold, which in turn 
underestimates the real value. Therefore, for this kind of disease, the best
approach would be to build a good social dynamics model by fitting the
empirical data for several time periods, and to calculate its corresponding
epidemic threshold.

Dynamic models such as the one presented here still need the addition of many
features before being considered as reasonable representations of real popula- 
tions, such as the possibility of having asymptomatic individuals, and the divi- 
sion of the population into groups with different epidemiological characteristics. 
There is also room for improvement in the approximations used for the analysis 
of the model. One possibility is to go one step further from the mean field 
theory and to consider a pair approximation. It is not clear, however, whether 
such modifications would lead to a model amenable to analytical solutions or 
approximations, which is one of the main advantages of the model presented in 
this paper. 
Appendix



A Distribution of accumulated contacts 

By Laplace transforming Eqs. (9), solving, and back transforming, it can be 
shown that the probabilities that an individual has had m new contacts at the 
end of a time period of length T, given that he had n at the beginning of that 
period, are of the form: 

m m+n-i T^i,,- :/ l /< , + /<T; ) 

PTim\n)=Y: E AT^-^ (19) 

The constants A™" are obtained from the following recursions: 
for $i = 1, \dots, m-1$ and $j = 0, \dots, m+n-i-1$

^rm+n-i('^n ~ Cm+„_i) = PnAH^^n-i iOT i = I, - ■ ■ , m — 1 

A^,"(c„-c,) = ria„A™J-i forj=0,.-. (20) 

The remaining constants are obtained from the conditions Po{m\n) = if 
m > and Po(0|7i) - 1: = "Er/™ ^oj"' ^oS = 1 " EjJo ^o", and 

Ag8 = 1. 



B Models A and B 

For models of the form $\rho(k_i, k_j) = \rho(k_i)\rho(k_j)$ and $\sigma(k_i, k_j) = \sigma(k_i)\sigma(k_j)$ the self
consistency parameters are Xi = -^^x. x is obtained by solving 

''-^''^Y.M^,i)ip{i-xr ^'^> 

If now all the creation functions are multiplied by the same constant, $C$, and
the self-consistency parameter is rescaled as $x' = Ax$, Eq. (21) becomes

x' _ {k)j:^pizJ)P{l;x') ^ 

c- Ei^i^J)iPih^') ' ^ ' 

As mentioned in the text, models A and B are defined as follows. Model 
A: p{k) = CAk^/{k + 1)2 (for k > 0), p(0) = 1, and a{k) = 1. Model B: 
p{k) = Csk^/ik + 1) (for k > 0), p{0) = 1, and cr(fc) = k. Ca and C'b are 
numerical constants. The instantaneous distribution is P{k) — PoDkX^ /k^, 
where Dk = 11^=1(1 ^ */^)- FiglS] shows f{x') and x'/C, for different values 
of the constant $C$, for $N = 10000$. At $A \approx 1.35$ there is a discontinuous phase
transition from a network with $x' \approx 1$ to a network with $x' = O(N)$.



C NATSAL model 

To obtain a model that fits the NATSAL data we have taken into account the
fact that the number of respondents was rather small and, as a consequence, the
sampling error for the number of respondents declaring having had more than
two partners is likely to be rather large. We have chosen to adjust separately
only the values of $\rho(0)$, $\rho(1)$, $\sigma(1)$, and $\sigma(2)$ to fit the
number of respondents that reported 0 or 1 partner. The rest of the data were
fitted using the generic
functions p{i) ~ Cpi^ /{i + 1)'^"^ and a{i) = C^i^ ■ The fits were performed 
sequentially. In the first step we fitted $\rho(0)$, $\rho(1)$, $\sigma(1)$, and $\sigma(2)$ using the
analytic expressions for $P_T(0)$ and $P_T(1)$. In the second step, a coarse sampling
Figure 5: $f(x')$ as a function of $x'$ for model A (see text).



of parameter space was performed, in order to select a suitable region on which to
focus. This selection was performed by calculating several different distributions
$P_T(k)$ for relatively small values of $k$ ($\approx 50$) (which takes only a few
seconds of computation time) and choosing the one that best fitted the data. In the
third step, a fine tuning of the parameters found was performed by generating
some 'full' distributions (up to $k = 100$) (which typically takes a couple of days
of computation time) for small displacements from the parameters selected in
the previous step. The values obtained for the different parameters are given in
Table 1.

To compensate for the heaping present in the number of partners reported
(i.e. the preference of respondents for round numbers, especially for large
numbers), we have applied geometric binning to the data. Nevertheless, the fits
obtained are quite good for other presentations of the data, such as the
cumulative numbers of partners (see Fig. 2 in the main article).
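
The geometric binning mentioned above can be sketched as follows. This is only
a minimal illustration under stated assumptions: the bin ratio of 2, the
half-open bin convention, the function names and the sample data are all
hypothetical, and the binning actually used for the NATSAL data may differ.

```python
def geometric_bins(max_value, ratio=2.0):
    """Return bin edges 1, 2, 4, ... whose bins cover 1..max_value."""
    edges = [1]
    while edges[-1] <= max_value:
        # Grow by the chosen ratio, but always by at least one unit.
        nxt = max(edges[-1] + 1, int(round(edges[-1] * ratio)))
        edges.append(nxt)
    return edges


def bin_counts(partner_counts, edges):
    """Count reports in each bin [edges[i], edges[i+1]) and divide by bin width."""
    counts = [0] * (len(edges) - 1)
    for n in partner_counts:
        for i in range(len(edges) - 1):
            if edges[i] <= n < edges[i + 1]:
                counts[i] += 1
                break  # reports of 0 partners fall in no bin and are skipped
    density = [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]
    return counts, density


# Made-up reports with heaping at round numbers (10, 20, 50, 100).
reported = [1, 1, 2, 3, 5, 10, 10, 20, 50, 100]
edges = geometric_bins(max(reported))
counts, density = bin_counts(reported, edges)
print(edges, counts)
```

Because the bin width grows with the number of partners, reports heaped on a
round number are pooled with their neighbours instead of producing isolated
spikes in the tail of the distribution.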

The estimates of the static epidemic threshold shown in the insets of Figs.
1 and 2 in the main text were calculated using the accumulated partners
distributions found, i.e. up to $k = 100$. Therefore the values are not exact, but
it can be shown that they are upper bounds to the values calculated using the 
full distributions. This means that the difference between the exact estimations 
and the static threshold is even larger than what is shown in the insets. 
Table 1: Parameters used for the NATSAL model

Parameter     Value
$\rho_0$      0.0373
$\rho_1$      0.0459
$\sigma_1$    0.0348
$\sigma_2$    0.0842
$\beta$       2.7
$\gamma$      1
$C_\rho$      0.0229
$C_\sigma$    0.132



D Properties of the epidemic threshold 

Using Eq. 5 of the main text, the elements of matrix $A_0$ can be written as
$(A_0)_{i+1,i} = -i\sigma_i$,
$(A_0)_{ii} = \frac{P(i-1)}{P(i)}(i-1)\sigma_{i-1} + i\sigma_i$ and
$(A_0)_{i-1,i} = -\frac{P(i-1)}{P(i)}(i-1)\sigma_{i-1}$. If we define a
diagonal matrix $D_P$ such that $(D_P)_{ii} = P(i)$, it is straightforward to
see that $A_0$ can be written as $A_0 = A'_0 D_P^{-1}$, where $A'_0$ is a
symmetric, tridiagonal matrix, with vanishing row (and column) sums, defined
by $(A'_0)_{i,i+1} = -iP(i)\sigma_i$. Therefore, the Gershgorin theorem implies
that $A'_0$ is positive semidefinite. That is, it has the property that

$$x^T A'_0 x \geq 0 \qquad (23)$$

for any vector $x$. Using $x = D_P^{-1}(\alpha I + A_0)^{-1}D_P k$ in Eq. (23)
and using the definition of $\lambda_c$ (Eq. 12) it can be shown that

$$\frac{\partial \lambda_c^{-1}}{\partial \alpha} = k^T(\alpha I + A_0)^{-1}A_0(\alpha I + A_0)^{-1}D_P k = x^T A_0 D_P x \geq 0 \qquad (24)$$
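
The Gershgorin argument used above can be checked numerically on a small
example. The sketch below builds a symmetric tridiagonal matrix with vanishing
row sums and couplings of the form $-iP(i)\sigma_i$, and verifies that no
Gershgorin disc extends below zero and that the quadratic form is never
negative. The values of $P$ and $\sigma$ are made up for illustration, and the
helper names are hypothetical; the last entry of `sigma` is unused because
there is no state beyond the last one.

```python
import random


def build_symmetric_tridiagonal(P, sigma):
    """Symmetric tridiagonal matrix with zero row sums.

    States i and i+1 (1-based) are coupled by -i * P(i) * sigma_i.
    """
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for idx in range(n - 1):
        i = idx + 1                      # 1-based state label
        w = i * P[idx] * sigma[idx]
        M[idx][idx + 1] -= w
        M[idx + 1][idx] -= w
        M[idx][idx] += w                 # diagonal balances the row sum
        M[idx + 1][idx + 1] += w
    return M


def gershgorin_lower_bound(M):
    """Smallest left end of the Gershgorin discs (bounds all eigenvalues below)."""
    n = len(M)
    return min(M[i][i] - sum(abs(M[i][j]) for j in range(n) if j != i)
               for i in range(n))


def quadratic_form(M, x):
    """Evaluate x^T M x."""
    n = len(M)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


# Made-up values, for illustration only (not the fitted NATSAL parameters).
P = [0.50, 0.30, 0.15, 0.05]
sigma = [0.03, 0.08, 0.10, 0.10]
M = build_symmetric_tridiagonal(P, sigma)

random.seed(0)
forms = [quadratic_form(M, [random.uniform(-1, 1) for _ in P])
         for _ in range(1000)]
lower_bound = gershgorin_lower_bound(M)
constant_form = quadratic_form(M, [1.0] * len(P))
print(lower_bound, min(forms), constant_form)
```

The quadratic form vanishes for the constant vector, which is why the matrix
is semidefinite rather than strictly positive definite.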

We can also show that the growth of $\lambda_c$ is not unbounded, by calculating
$\lim_{t\to\infty}\lambda_c = \lim_{\alpha\to 0}\lambda_c$. For this, we need to
calculate the limit of $(A_0 + \alpha I)^{-1}$. Note that it can be written as
$(A_0 + \alpha I)^{-1} = \mathrm{adj}(A_0 + \alpha I)/\det(A_0 + \alpha I)$.
The adjoint of a matrix $A$ is defined as $(\mathrm{adj}A)_{ij} = (-1)^{i+j}M_{ij}$,
where $M_{ij}$ are the minors of $A$, i.e. $M_{ij}$ is the determinant of the
matrix obtained by deleting row $i$ and column $j$ from $A$.

The minors of $A_0$ can be written as $M_{ij} = M'_{ij}P(j)/\det(D_P)$, where
$M'_{ij}$ are the minors of $A'_0$. But the fact that all row and column sums
vanish implies that $M'_{ij} = (-1)^{i+j}M'_{11}$. It also implies that the
determinant of $A_0 + \alpha I$ can be calculated by replacing each element of
its first row by $\alpha$. Using the Laplace expansion for the determinant, we
then get

$$\det(A_0 + \alpha I) = \alpha\sum_j (-1)^{1+j}M_{1j} + O(\alpha^2) = \alpha\frac{M'_{11}}{\det D_P} + O(\alpha^2) \qquad (25)$$

Using now that $(\mathrm{adj}A_0)_{ij} = M'_{11}P(j)/\det D_P$, we obtain

$$\lim_{\alpha\to 0}\alpha\,\big((A_0 + \alpha I)^{-1}\big)_{ij} = P(j) \qquad (26)$$

Replacing now this expression in Eq. 12 leads to
$\lambda^\infty = \lim_{\alpha\to 0}\lambda_c = 1/\langle k\rangle$.
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